Sample Efficient Policy Search for Optimal Stopping Domains

Authors

Karan Goel

Christoph Dann

Emma Brunskill

Abstract

Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging problem structure. We bound the sample complexity of our approach to guarantee uniform convergence of policy value estimates, tightening existing PAC bounds to achieve logarithmic dependence on horizon length for our setting. We also examine the benefit of our method against prevalent model-based and model-free approaches on 3 domains taken from diverse fields.
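
The data-reuse idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's GFSE algorithm, whose details are not given here. It only assumes that the observation process evolves independently of the stopping decision, so a single batch of full-length trajectories can score every candidate stopping policy. The random-walk process, the threshold policy class, and all function names below are hypothetical choices made for illustration.

```python
import random

def simulate_trajectory(rng, horizon):
    """Hypothetical process: a Gaussian random walk observed for `horizon` steps."""
    x, path = 0.0, []
    for _ in range(horizon):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def stopped_return(path, threshold):
    """Return obtained by stopping at the first observation >= threshold.

    If the threshold is never reached, the process stops at the final step.
    """
    for x in path:
        if x >= threshold:
            return x
    return path[-1]

def evaluate(paths, threshold):
    """Monte Carlo value estimate of one threshold policy on shared trajectories."""
    return sum(stopped_return(p, threshold) for p in paths) / len(paths)

def search(paths, thresholds):
    """Score every candidate on the same data; return (best threshold, its value)."""
    scored = [(evaluate(paths, t), t) for t in thresholds]
    best_value, best_threshold = max(scored)
    return best_threshold, best_value

# Collect the data once, then reuse it for all ten candidate policies.
rng = random.Random(0)
paths = [simulate_trajectory(rng, horizon=20) for _ in range(200)]
best_threshold, best_value = search(paths, [0.5 * k for k in range(1, 11)])
print(best_threshold, best_value)
```

Because every candidate is evaluated on the same 200 trajectories, adding more candidate policies costs no additional interaction with the environment in this sketch.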

Related resources

Sample Efficient Bayesian Optimization for Policy Search: Case Studies in Robotics and Education

In this work we investigate the problem of learning adaptive strategies, called policies, in domains where evaluating different policies is costly. We formalize the problem as direct policy search: searching the space of policy parameters to identify policies that perform well with respect to a given objective. Bayesian Optimization is one method suitable for such settings, when sample/data eff...

Optimal Stopping Policy for Multivariate Sequences a Generalized Best Choice Problem

In the classical versions of “Best Choice Problem”, the sequence of offers is a random sample from a single known distribution. We present an extension of this problem in which the sequential offers are random variables but from multiple independent distributions. Each distribution function represents a class of investment or offers. Offers appear without any specified order. The objective is...

Optimal Placement and Sizing of TCSC & SVC for Improvement Power System Operation using Crow Search Algorithm

Abstract: The need for more efficient power systems has prompted the use of new technologies, including Flexible AC Transmission System (FACTS) devices. FACTS devices provide new opportunities for controlling line power flow and minimizing losses while maintaining bus voltages within permissible limits. In this thesis a new method is proposed for optimal placement and sizing of Thyristo...

Optimal design for multi-arm multi-stage clinical trials

In early stages of drug development there is often uncertainty about the most promising among a set of different treatments. In order to ensure the best use of resources it is important to decide which, if any, of the treatments should be taken forward for further testing. Multi-arm multi-stage (MAMS) trials provide gains in efficiency over separate randomised trials of each treatment. They all...

Optimal Search and Stop in Continuous Search Process

This paper investigates an optimal search policy with stopping for a stationary target being in one of n boxes. It is assumed that the search is conducted continuously with a total search cost C per unit time and the search in box i costs ci per unit search effort. The conditional probability of detecting the target with unit search effort is αi and a reward Ri is given to the searcher when he...

Publication date: 2017
